158
13
Useful Algorithms
with r Subscript nrn, and the problem is solved. This example is a well-defined procedure that
leads automatically to the desired result.
An operation frequently required in bioinformatics is sorting a collection of items
(an array of elements), implying arranging them in increasing (or decreasing) order.
The so-called bubble sort (elements “float” to the top of the array) is considered to
be the simplest sorting algorithm. Each element is compared pairwise to each other.
If a pair is found to be in the incorrect order, the two elements are interchanged. The
algorithm is based on two DO-loops (repeat the instructions within the loop for a
preset number of times, or until some condition is fulfilled), one nested inside the
other. The outer loop runs from 1 to1 minus left parenthesis1 −(the length of the array), and the inner loop
runs from 1 plus left parenthesis1 + (a counter of the outer loop) up to the length of the array.
This sort algorithm is not particularly efficient, in the sense that algorithms requir-
ing fewer instructions to accomplish the same task are available. Often these more
efficient algorithms take more time to program, however. For example, the fast
Fourier transform does indeed require significantly fewer instructions than the ordi-
nary Fourier transform, but nowadays, with the almost universal availability of per-
sonal computers, provided the dataset being transformed is not too large, the extra
work of programming might not be worth the bother. Most personal computers are
switched off at night when they could actually be calculating. The Intel Pentium chip,
introduced around 1994, can carry out more than 100 MIPS (million instructions per
second); this is 5 times faster than the 486 chip, introduced around 1992, and 100
times more than the mainframe DEC VAX 780, introduced in 1977, and for more
than a decade the workhorse of many computing centres; the Cray 1 supercomputer
(1975) could already execute 160 MIPS. The DEC PDP1, introduced around 1960
and also very widely encountered in its day, carries out 0.1 MIPS. IBM’s Deep Blue,
introduced in 1996, can accomplish 10 Superscript 6106 MIPS. Current processors such as the top
of the range Intel Core i7 4770k operate at nearly 130,000 MIPS, while the Blue
Gene Q supercomputer with thousands of processors is capable of 20 petaFLOPS
(i.e.,2 times 10 Superscript 162 × 1016 floating point operations per second which, depending on architecture,
could be many tens of MIPS). The newer Intel chips found in laptops are very pow-
erful: the core i9 (2018) achieves about 410,000 MIPS. When computing jobs were
processed batchwise on a mainframe device there was, of course, strong pressure to
achieve operations such as sorting and matching with as few instructions as possible;
but when the ubiquitous personal computer has 100 times more processing power
than a VAX 780, the effort of achieving it may be considered superfluous by all who
are not professional programmers.
Problem. Write a program to implement the bubble sort algorithm in a high-level
computer language.
Problem. Write an algorithm for searching for all occurrences of a particular word
(a substring) in a string and returning the distance of each occurrence from the start
of the string.